{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 自动求梯度\n",
    "\n",
    "在深度学习中，我们经常需要对函数求梯度（gradient）。本节将介绍如何使用PyTorch提供的`autograd`模块来自动求梯度。如果对本节中的数学概念（如梯度）不是很熟悉，可以参阅附录中[“数学基础”](../chapter_appendix/math.ipynb)一节。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import autograd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 简单例子\n",
    "\n",
    "我们先看一个简单例子：对函数 $y = 2\\boldsymbol{x}^{\\top}\\boldsymbol{x}$ 求关于列向量 $\\boldsymbol{x}$ 的梯度。我们先创建变量`x`，并赋初值。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[0.],\n",
       "        [1.],\n",
       "        [2.],\n",
       "        [3.]])"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = torch.arange(4, dtype=torch.float).reshape(4, 1)\n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了求有关变量`x`的梯度，我们需要设置`x`的`requires_grad`属性为`True`。\n",
    "\n",
    "\n",
    "**注: 在 PyTorch 中设置一个 `Tensor` 的 `requires_grad` 属性为 `True` 要求这个 `Tensor` 的数据类型必须是 `float`**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "x.requires_grad = True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下面定义有关变量`x`的函数。如果一个`Tensor`的`requires_grad`属性被设置为`True`，那么所有依赖于它运算得到的`Tensor`的`requires_grad`属性也会被设置为`True`。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = 2 * torch.mm(x.t(), x)\n",
    "y.requires_grad"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "由于`x`的形状为（4, 1），`y`是一个标量。接下来我们可以通过调用`backward`函数自动求梯度。需要注意的是，如果`y`不是一个标量，那么需要调用`backward(torch.ones_like(y))`函数自动求梯度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "y.backward()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "函数 $y = 2\\boldsymbol{x}^{\\top}\\boldsymbol{x}$ 关于$\\boldsymbol{x}$ 的梯度应为$4\\boldsymbol{x}$。现在我们来验证一下求出来的梯度是正确的。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[ 0.],\n",
       "        [ 4.],\n",
       "        [ 8.],\n",
       "        [12.]])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "assert (x.grad - 4 * x).norm().item() == 0\n",
    "x.grad"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 训练模式和预测模式\n",
    "\n",
    "默认情况下PyTorch处于训练模式，但在预测模式时，我们不需要对变量进行求导，可以使用`with torch.no_grad():`，该上下文管理器后面进行的计算所得到的`Tensor`都不会保存梯度。\n",
    "\n",
    "**注：也可以通过 `model.train()` 和 `model.eval()` 切换训练模型的两种模式。**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n",
      "False\n"
     ]
    }
   ],
   "source": [
    "print(y.requires_grad)\n",
    "with torch.no_grad():\n",
    "    y2 = 2 * torch.mm(x.t(), x)\n",
    "    print(y2.requires_grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在有些情况下，同一个模型在训练模式和预测模式下的行为并不相同。我们会在后面的章节详细介绍这些区别。\n",
    "\n",
    "\n",
    "## 对Python控制流求梯度\n",
    "\n",
    "使用PyTorch的一个便利之处是，即使函数的计算图包含了Python的控制流（如条件和循环控制），我们也有可能对变量求梯度。\n",
    "\n",
    "考虑下面程序，其中包含Python的条件和循环控制。需要强调的是，这里循环（while循环）迭代的次数和条件判断（if语句）的执行都取决于输入`a`的值。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(a):\n",
    "    b = a * 2\n",
    "    while b.norm().item() < 1000:\n",
    "        b = b * 2\n",
    "    if b.sum().item() > 0:\n",
    "        c = b\n",
    "    else:\n",
    "        c = 100 * b\n",
    "    return c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们像之前一样使用`record`函数记录计算，并调用`backward`函数求梯度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = torch.randn(1)\n",
    "a.requires_grad = True\n",
    "c = f(a)\n",
    "c.backward()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们来分析一下上面定义的`f`函数。事实上，给定任意输入`a`，其输出必然是 `f(a) = x * a`的形式，其中标量系数`x`的值取决于输入`a`。由于`c = f(a)`有关`a`的梯度为`x`，且值为`c / a`，我们可以像下面这样验证对本例中控制流求梯度的结果的正确性。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([1], dtype=torch.uint8)"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.grad == c / a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 小结\n",
    "\n",
    "* PyTorch提供`autograd`模块来自动化求导过程。\n",
    "* PyTorch的`autograd`模块可以对一般的命令式程序进行求导。\n",
    "* PyTorch的运行模式包括训练模式和预测模式。我们可以通过`autograd.is_training()`来判断运行模式。\n",
    "\n",
    "## 练习\n",
    "\n",
    "* 在本节对控制流求梯度的例子中，把变量`a`改成一个随机向量或矩阵。此时计算结果`c`不再是标量，运行结果将有何变化？该如何分析该结果？\n",
    "* 重新设计一个对控制流求梯度的例子。运行并分析结果。\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "## 扫码直达[讨论区](https://discuss.gluon.ai/t/topic/744)\n",
    "\n",
    "![](../img/qr_autograd.svg)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:pytorch]",
   "language": "python",
   "name": "conda-env-pytorch-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
